ABOUT THE DATA

The original data contains 5 columns:

  1. ts-> Time stamp
  2. received_ts-> Time stamp at which the data was received
  3. device_uuid-> Device's unique identifier
  4. data_item_name-> The coordinate axis to which the data belongs
  5. value-> The reading for that particular coordinate axis

After reviewing the data and the problem statement, the key takeaway is that the data comes from a gyroscope sensor and is taken from a single device.

In [1]:
# module for data manipulation
import pandas as pd
In [3]:
# reading the data
data=pd.read_csv("ScooterIMUData.csv")

# first five rows of the data
data.head()
Out[3]:
ts received_ts device_uuid data_item_name value
0 2021-11-10 12:18:47.150 2021-11-10 12:18:56.856 s_5777 GYR_X_DEG -0.010051
1 2021-11-10 12:18:47.150 2021-11-10 12:18:56.860 s_5777 GYR_Y_DEG -0.076319
2 2021-11-10 12:18:47.150 2021-11-10 12:18:56.865 s_5777 GYR_Z_DEG -0.044205
3 2021-11-10 12:18:47.170 2021-11-10 12:18:56.856 s_5777 GYR_X_DEG -0.029533
4 2021-11-10 12:18:47.170 2021-11-10 12:18:56.860 s_5777 GYR_Y_DEG -0.014959
In [4]:
# gathering information about the data
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 334467 entries, 0 to 334466
Data columns (total 5 columns):
 #   Column          Non-Null Count   Dtype  
---  ------          --------------   -----  
 0   ts              334467 non-null  object 
 1   received_ts     334467 non-null  object 
 2   device_uuid     334467 non-null  object 
 3   data_item_name  334467 non-null  object 
 4   value           334467 non-null  float64
dtypes: float64(1), object(4)
memory usage: 12.8+ MB
In [5]:
data.describe()
Out[5]:
value
count 334467.000000
mean -0.003939
std 0.087412
min -1.803581
25% -0.018011
50% -0.004617
75% 0.008059
max 1.552032
In [6]:
print("Number of rows are {} and number of columns are {}".format(str(data.shape[0]),str(data.shape[1])))
Number of rows are 334467 and number of columns are 5
In [7]:
# checking for Null values 
data.isnull().sum()
Out[7]:
ts                0
received_ts       0
device_uuid       0
data_item_name    0
value             0
dtype: int64

Data is complete

In [8]:
# dropping duplicate values within the data 
print(" Number of rows present in data before dropping duplicate values:-",data.shape[0])

# using drop_duplicates function from pandas library
data=data.drop_duplicates()

print(" Number of rows present in data after dropping duplicate values:-",data.shape[0])
 Number of rows present in data before dropping duplicate values:- 334467
 Number of rows present in data after dropping duplicate values:- 334466

Assumptions made for Gyroscope

  1. The gyroscope is placed vertically, i.e. the Z axis points towards the headlight of the scooter. Rotation of the handlebar is represented by the Y axis, and the X axis captures the upward and downward movement.

  2. The clockwise rotation is -ve and anticlockwise rotation is +ve

Gyroscope.jpg

The time stamps carry a lot of information, so let's simplify the dataset to extract as much of it as possible from these columns.

We divide each time stamp into separate date and time columns.

We also add a new column holding the time stamp difference, to measure the activation time duration of the sensor.

The time of day is converted to a single unit, measured in seconds.

In [9]:
# Function to convert the time stamp in HR:MIN:SS format to seconds
def seconds(str1):
  
  # Split the time stamp into separate values
  # The result is a list containing
  # the hours as the first element
  # the minutes as the second element
  # the seconds as the third element
  str1=list(map(float,str1.split(":")))
  
  # convert the whole to seconds
  second=str1[0]*3600+str1[1]*60+str1[2]
  
  # return
  return second
    
# lists to store the data for new columns
# list to store the timestamp date
ts_date=[]

# list to store the received timestamp date
received_ts_date=[]

# list to store the timestamp
timestamp=[]

# list to store the received timestamp
received_timestamp=[]

# list to store the difference of time stamps
difference=[]

# iterate through each row of the dataset
for i in range(data.shape[0]):

  # add the data items to the list
  # using string indexing to get the required data from the string
  received_ts_date.append(data.iloc[i].received_ts[:10])
  ts_date.append(data.iloc[i].ts[:10])

  # get the time in seconds
  t1=seconds(data.iloc[i].ts[11:])
  t2=seconds(data.iloc[i].received_ts[11:])

  # adding the timestamp lists
  timestamp.append(t1)
  received_timestamp.append(t2)
  difference.append(int(t2-t1))

# creating new columns in the dataset
data["ts_date"]=ts_date
data["received_ts_date"]=received_ts_date
data["timestamp"]=timestamp
data["received_timestamp"]=received_timestamp
data['difference']=difference

# removing time stamp and received time stamp columns from the dataset as their 
# information is distributed to new columns now
data=data.drop(["ts","received_ts"],axis=1)

# top 5 rows of new dataset
data.head()
Out[9]:
device_uuid data_item_name value ts_date received_ts_date timestamp received_timestamp difference
0 s_5777 GYR_X_DEG -0.010051 2021-11-10 2021-11-10 44327.15 44336.856 9
1 s_5777 GYR_Y_DEG -0.076319 2021-11-10 2021-11-10 44327.15 44336.860 9
2 s_5777 GYR_Z_DEG -0.044205 2021-11-10 2021-11-10 44327.15 44336.865 9
3 s_5777 GYR_X_DEG -0.029533 2021-11-10 2021-11-10 44327.17 44336.856 9
4 s_5777 GYR_Y_DEG -0.014959 2021-11-10 2021-11-10 44327.17 44336.860 9
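
The same conversion can be sketched without the row-by-row loop. The snippet below is an illustrative alternative, not the code that produced the output above. It relies on `pd.to_datetime` and uses a two-row stand-in frame named `sample` (a hypothetical name) built from the first rows shown earlier. It takes the difference on the full time stamps, so unlike the time-of-day subtraction it would also hold if a reading were received after midnight.

```python
import pandas as pd

# Two-row stand-in for ScooterIMUData.csv (values copied from the head() output)
sample = pd.DataFrame({
    "ts": ["2021-11-10 12:18:47.150", "2021-11-10 12:18:47.170"],
    "received_ts": ["2021-11-10 12:18:56.856", "2021-11-10 12:18:56.860"],
})

# parse the strings into datetime columns
ts = pd.to_datetime(sample["ts"])
received = pd.to_datetime(sample["received_ts"])

# date part as a string, matching the ts_date / received_ts_date columns
sample["ts_date"] = ts.dt.strftime("%Y-%m-%d")
sample["received_ts_date"] = received.dt.strftime("%Y-%m-%d")

# seconds since midnight: subtract the date at 00:00:00 from the time stamp
sample["timestamp"] = (ts - ts.dt.normalize()).dt.total_seconds()
sample["received_timestamp"] = (received - received.dt.normalize()).dt.total_seconds()

# whole-second gap between sending and receiving (fraction is truncated)
sample["difference"] = (received - ts).dt.total_seconds().astype(int)

print(sample[["timestamp", "received_timestamp", "difference"]])
```

On the full dataset the same expressions would be applied to `data` directly, which avoids calling `data.iloc[i]` once per row.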

Given the alignment of our gyroscope, our aim is to monitor changes in the Y coordinate of the gyroscope to determine the rotation of the handlebar that should activate the indicator.

In [10]:
# taking data with Y deg coordinates
data1=data[data.data_item_name=="GYR_Y_DEG"]

# display first 5 rows
data1.head()
Out[10]:
device_uuid data_item_name value ts_date received_ts_date timestamp received_timestamp difference
1 s_5777 GYR_Y_DEG -0.076319 2021-11-10 2021-11-10 44327.150 44336.86 9
4 s_5777 GYR_Y_DEG -0.014959 2021-11-10 2021-11-10 44327.170 44336.86 9
7 s_5777 GYR_Y_DEG -0.027170 2021-11-10 2021-11-10 44327.190 44336.86 9
10 s_5777 GYR_Y_DEG 0.020148 2021-11-10 2021-11-10 44327.214 44336.86 9
13 s_5777 GYR_Y_DEG 0.020454 2021-11-10 2021-11-10 44327.230 44336.86 9

In addition to the values, the sign can indicate the rotation direction of the vehicle.

A positive value means that the scooter is turning left and a negative value means that it is turning right.

Let's create a column indicating the direction of turn

In [11]:
# the new column will be a categorical column containing two categories
# Left:- For a +ve value the direction is left
# Right:- For a -ve value the direction is right

# list to store the directions
turn_direction=[]

# iterating through each datapoint 
for i in range(data1.shape[0]):

  # getting the value at the datapoint
  val=data1.iloc[i].value

  # if the value is less than 0 then Right
  if val<0:
    turn_direction.append("Right")
  
  # if the value is greater than 0 then Left
  elif val>0:
    turn_direction.append("Left")

  # for the third case the y coordinate value is 0 
  # this indicates that the vehicle is moving straight ahead
  # so there is no turn in either direction
  else:
    turn_direction.append("No Movement")

# add the new data column to the data
# assign returns a new DataFrame, which avoids the SettingWithCopyWarning
# that pandas raises when a column is written into a slice of another DataFrame
data1=data1.assign(turn_direction=turn_direction)

# First 5 rows
data1.head()
Out[11]:
device_uuid data_item_name value ts_date received_ts_date timestamp received_timestamp difference turn_direction
1 s_5777 GYR_Y_DEG -0.076319 2021-11-10 2021-11-10 44327.150 44336.86 9 Right
4 s_5777 GYR_Y_DEG -0.014959 2021-11-10 2021-11-10 44327.170 44336.86 9 Right
7 s_5777 GYR_Y_DEG -0.027170 2021-11-10 2021-11-10 44327.190 44336.86 9 Right
10 s_5777 GYR_Y_DEG 0.020148 2021-11-10 2021-11-10 44327.214 44336.86 9 Left
13 s_5777 GYR_Y_DEG 0.020454 2021-11-10 2021-11-10 44327.230 44336.86 9 Left
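
The sign rule described above (negative means right, positive means left, zero means no movement) can also be written without a loop. The following is a minimal sketch using `numpy.select`; the `values` series is a hypothetical stand-in for `data1.value` and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Stand-in for data1.value: three readings from the table plus an exact zero
values = pd.Series([-0.076319, -0.014959, 0.020148, 0.0])

# conditions are checked in order; rows matching none get the default label
turn_direction = np.select(
    [values < 0, values > 0],
    ["Right", "Left"],
    default="No Movement",
)

print(list(turn_direction))
```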

DATA ANALYSIS

Analysing the data to get insights about it.

Target:-

  1. We need to find a pattern that tells us about the turning radius at which the indicator should be activated.

  2. Next we need to get an idea of the time duration for which the indicator should stay active.

In [12]:
# importing required libraries

import seaborn as sns
import plotly.express as px
import warnings
warnings.simplefilter("ignore")
In [13]:
# plotting a scatter plot for the value column of the dataset
# The color of the data determines the difference of the timestamps

# y axis represents the values
y=data1.value

# x axis has the running index of the readings
x=list(range(data1.shape[0]))

# plotting the scatter plot
fig = px.scatter(x=x, y=y , color=data1.difference, labels={
                     'x': "Count",
                     'y': "Gyroscope Reading",
                     'color':'difference'
                     
                 }, template = 'plotly_dark')

fig.show()

KEY TAKEAWAYS FROM THE PLOT

  1. Most of the readings lie between -0.5 and 0.5
  2. For most intervals the time duration is below 150 seconds
  3. A time duration above 150 seconds may indicate the standing lock time of the vehicle.
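
Observations like these can be backed by numbers as well as by the plot. The sketch below shows one way to do that with boolean masks; the `sample` frame holds made-up values purely for illustration, and on the real data `data1` would take its place.

```python
import pandas as pd

# Toy stand-in for data1 (hypothetical values, not the real readings)
sample = pd.DataFrame({
    "value": [-0.08, 0.02, 0.6, -0.01, 0.3],
    "difference": [9, 9, 200, 8, 12],
})

# fraction of readings whose value lies between -0.5 and 0.5 (inclusive)
share_small = sample["value"].between(-0.5, 0.5).mean()

# fraction of readings whose time difference is below 150 seconds
share_short = (sample["difference"] < 150).mean()

print(share_small, share_short)
```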
In [14]:
# plotting the frequency of each difference in interval
# the difference determines the activation time of gyroscope reading

# required imports
from collections import Counter

# Counter returns a frequency distribution dictionary
# with the data items as keys and their occurrence frequency as values
# most_common() sorts the items by occurrence frequency, highest first
# the sorted pairs are then cast back to a dictionary
# the result is a dictionary of time differences sorted by how often each one occurs
count=dict(Counter(data1.difference).most_common())


# basic bar chart 
# the x axis will be the keys
# the y axis will have the values
# plotting a bar plot for the 15 most frequent values 
fig = px.bar(x=list(count.keys())[:15],
             y=list(count.values())[:15], labels={
                     'x': "Differences",
                     'y': "Occurrence",
                     
                 }, template = 'plotly_dark')

fig.show()

KEY TAKEAWAYS FROM THE PLOT

  1. The top 5 values by frequency are between 5 and 10, with 9 the most frequent
  2. The differences are spread widely across the 0-20 seconds region
  3. These may indicate small turns or lane changes that require the indicator for a few seconds only.
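
A ranking by frequency like the one described above can also be read off directly with pandas. This is a small sketch on made-up numbers; `difference` is a hypothetical stand-in for `data1.difference`.

```python
import pandas as pd

# Toy stand-in for data1.difference (whole-second differences, not the real data)
difference = pd.Series([9, 9, 9, 8, 8, 7, 12, 150])

# value_counts sorts by frequency, most frequent value first
top = difference.value_counts().head(3)

print(top)
```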
In [15]:
# scatter plot of the same frequencies for the complete data
fig = px.scatter(x=list(count.keys()), y=list(count.values()) , labels={
                     'x': "Differences",
                     'y': "Occurrences",
                     
                 }, template = 'plotly_dark')

fig.show()
In [16]:
# Plot showing the turning frequency for the data
# required Import
from collections import Counter

# get frequency dictionary for each label
mydict=dict(Counter(data1['turn_direction']))

# plot the data
fig = px.bar(data1, x=list(mydict.keys()),
             y=list(mydict.values()), labels={
                     'x': "Turn_Indication",
                     'y': "Count",
                     
                 }, template = 'plotly_dark')

fig.show()
In [17]:
# Plot to categorize the gyroscope readings as turning or non turning
# All the values which are 0 up to 2 decimal places (magnitude below 0.01) are considered close to zero

# initialize counters to store the value
close_to_zero=0
others=0

# loop over the datapoints
for i in range(data1.shape[0]):

  # get value at particular datapoint
  val=data1.iloc[i].value

  # only consider values between -1 and 1, excluding exact zeros
  if val>-1 and val < 1 and val!=0:

    # a magnitude below 0.01 means the first two decimal places are 0
    # comparing numerically also handles very small values that
    # Python prints in scientific notation (for example 1e-05)
    if abs(val)<0.01:
      close_to_zero+=1
    
    # else the reading is not close to zero
    else:
      others+=1
  
# create a dictionary to store the information in an orderly way
mydict={"close_to_zero":close_to_zero,
        "Not_close_to_zero":others}

# plot a bar plot to show the result
fig = px.bar(data1, x=list(mydict.keys()),
             y=list(mydict.values()), labels={
                     'x': "Turn_Indication",
                     'y': "Count",
                     
                 }, template = 'plotly_dark')

fig.show()

The plot shows that there are more than 31315 instances where the change in the gyroscope reading does not indicate an actual turn of the vehicle or the need for an indicator.

These small deflections in the reading are probably caused by slight movements of the vehicle rather than by real turns.

PROPOSED SOLUTION

After going through the data and analysing the trends, the following solution and suggestions are proposed for automating the indicator.

  1. For gyroscope readings close to zero (zero up to two decimal places) there is no need to activate the indicator.
  2. The indicator activation will have two different levels:
  • Indicator_lv1-> the gyroscope reading is within 0.005 in either the -ve or the +ve direction.
  • Indicator_lv2-> the gyroscope reading is more than 0.005 but less than 1.
  3. Another key point to note is that for indicator_lv1 the avg indicator cut-off time will be 10 sec.
  4. For indicator_lv2 the avg time will be 150 seconds (this is referenced to the waiting time at a red light, which is at most 120 seconds).
  5. For indicators that stay on longer than the lv2 avg time, the indicator will automatically turn off and can be manually activated again.
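
The decision rule can be summarised as a small pure function. The sketch below is only one possible reading: the cut-off times (10 and 150 seconds) come from the list above, while the bands are assumptions that follow the thresholds applied by the `indicator` class in this notebook (readings within 0.005 of zero are ignored, magnitudes up to 1 are level 1, anything larger is level 2). The names `indicator_plan`, `DEAD_BAND`, `LEVEL_1_MAX` and `CUTOFF_SECONDS` are hypothetical.

```python
from typing import Optional, Tuple

DEAD_BAND = 0.005                  # assumed: readings this close to zero are ignored
LEVEL_1_MAX = 1.0                  # assumed: largest magnitude treated as level 1
CUTOFF_SECONDS = {1: 10, 2: 150}   # cut-off times taken from the list above


def indicator_plan(y: float) -> Optional[Tuple[str, int, int]]:
    """Return (direction, level, cutoff_seconds), or None if no indicator is needed."""
    magnitude = abs(y)

    # close to zero: no turn detected
    if magnitude <= DEAD_BAND:
        return None

    # sign convention used in this notebook: +ve is left, -ve is right
    direction = "Left" if y > 0 else "Right"

    # pick the level from the magnitude of the reading
    level = 1 if magnitude <= LEVEL_1_MAX else 2
    return direction, level, CUTOFF_SECONDS[level]


print(indicator_plan(0.9))
print(indicator_plan(-1.5))
print(indicator_plan(0.001))
```

Keeping the rule free of timers and printing makes it easy to check the thresholds on recorded readings before wiring it to the actual indicator.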
In [18]:
# The code is written by Pulkit Dhingra 

# importing required libraries
import time
import threading

# Solution algorithm
class indicator():
  
  # constructor
  def __init__(self):

    # The lately_on variable indicates whether the indicator was on recently or not
    self.lately_on=False

  # Function to check if the indicator is required or not
  def turn_indicator(self, y):
    
    # if 0 or close to 0, no turn detected
    if -0.005 <= y <= 0.005:
      return False

    # else return True, not close to zero so a turn is detected
    else:
      return True

  # function determining the indicator direction
  def indicator_direction(self, y):
    if y < 0:
      return "Right"
    else:
      return "Left"

  # indicator intensity
  def intensity(self, y):

    # if the indicator was on currently
    if self.lately_on:

      # turn off the lately on
      self.lately_on = False

      # return the lv 2 indicator setting
      return 2
    
    # not lately on
    else:

      # check for lv1 setting
      if y > 0.005 and y <= 1 or y < -0.005 and y > -1 :
        
        # Turn on the lately on key
        self.lately_on = True

        # return the lv1
        return 1
      
      # lv 2 setting
      else:

        # this branch is only reached when lately_on is False
        # (the indicator was not active earlier), so turn on the lately on
        self.lately_on = True

        # return lv2 setting
        return 2
  
  # function for display and maintaining time duration
  def time_lv(self,level,direction):

    # lv1 indicators stay on for 10 sec, all other levels for 150 sec
    duration=10 if level==1 else 150

    # print the direction once per second for the whole duration
    # counting iterations avoids comparing floating point time stamps with !=
    for _ in range(duration):

      print(direction)

      # sleep for a second
      time.sleep(1)

    return self.lately_on

  # controlling the lately_on activation time
  def lately_on_controller(self):

    # wait for five seconds
    time.sleep(5)
    
    # Set lately_on to false if no activity
    self.lately_on=False


# main
if __name__=="__main__":
  
  # take the X coord Y coord and Z coord readings 
  # these are the default readings that a gyroscope sensor gives 
  X,Y,Z=map(float,input().split())
  #X,Y,Z=0,0.9,1.9
  
  # object
  ather=indicator()

  # check if we need to turn on the indicator or not
  if ather.turn_indicator(Y):

    # Get the direction of the indicator where we need to turn
    direction=ather.indicator_direction(Y)

    # Get the intensity of indicator
    inten=ather.intensity(Y)

    # Turn the indicator on
    on=ather.time_lv(inten,direction)

    # if the lately_on is active
    if on:
      
      # start a thread to monitor the next activity
      # if no indicator is activated for the next five seconds 
      # the lately_on will automatically be reset to False
      t1=threading.Thread(target=ather.lately_on_controller)
      t1.start()
0 0.9 0
Left
Left
Left
Left
Left
Left
Left
Left
Left
Left
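
As a possible refinement of the five-second reset, the standard library's `threading.Timer` can clear the flag without a hand-written waiting loop. The sketch below is not part of the solution above; `LatelyOnFlag` and its methods are hypothetical names, and the delay is shortened to 0.1 seconds so the example finishes quickly.

```python
import threading
import time


class LatelyOnFlag:
    """Hypothetical helper: a flag that clears itself after `delay` seconds."""

    def __init__(self, delay=5.0):
        self.delay = delay
        self.lately_on = False
        self._timer = None

    def activate(self):
        # cancel a pending reset so a new activation restarts the countdown
        if self._timer is not None:
            self._timer.cancel()
        self.lately_on = True

        # Timer runs self._reset on a separate thread after `delay` seconds
        self._timer = threading.Timer(self.delay, self._reset)
        self._timer.daemon = True
        self._timer.start()

    def _reset(self):
        self.lately_on = False


flag = LatelyOnFlag(delay=0.1)
flag.activate()
state_right_after = flag.lately_on   # flag is set immediately

time.sleep(0.3)                      # wait longer than the delay
state_after_delay = flag.lately_on   # flag has been cleared by the timer

print(state_right_after, state_after_delay)
```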